
    Emergence of sensory selection mechanisms in Artificial Life simulations

    Background
    The evolutionary advantages of selective attention are unclear. Since the study of selective attention began, it has been suggested that the nervous system processes only the most relevant stimuli because its capacity is limited [1]. An alternative proposal is that action planning requires the inhibition of irrelevant stimuli, which forces the nervous system to limit its processing [2]. An evolutionary approach might provide additional clues to clarify the role of selective attention.

    Methods
    We developed Artificial Life simulations wherein animals were repeatedly presented with two objects, "left" and "right", each of which could be "food" or "non-food." The animals' neural networks (multilayer perceptrons) had two input nodes, one for each object, and two output nodes that determined whether the animal ate each object. The neural networks also had a variable number of hidden nodes, which determined whether or not they had enough capacity to process both stimuli (Table 1). The evolutionary relevance of the left and right food objects could also vary, depending on how much the animal's fitness increased when it ingested them (Table 1). We compared sensory processing in animals with or without limited capacity, evolved in simulations in which the objects had the same or different relevances.

    Table 1. Nine sets of simulations were performed, varying the values of the food objects and the number of hidden nodes in the neural networks. The values of the left and right food objects were swapped during the second half of the simulations. Non-food objects were always worth -3.

    The evolution of the neural networks was simulated by a simple genetic algorithm. Fitness was a function of the number of food and non-food objects each animal ate, and the chromosomes determined the node biases and synaptic weights. During each simulation, 10 populations of 20 individuals each evolved in parallel for 20,000 generations; the relevance of the food objects was then swapped and the simulation was run for another 20,000 generations. The neural networks were evaluated by their ability to identify the two objects correctly. The detectability (d') of the left and right objects was calculated using Signal Detection Theory [3].

    Results and conclusion
    When both stimuli were equally relevant, networks with two hidden nodes processed only one stimulus and ignored the other. With four or eight hidden nodes, they could correctly identify both stimuli. When the stimuli had different relevances, the d' for the most relevant stimulus was higher than the d' for the least relevant stimulus, even when the networks had four or eight hidden nodes. We conclude that selection mechanisms arose in our simulations depending not only on the size of the neural networks but also on the stimuli's relevance for action.
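
    As an illustration of the detectability analysis, here is a minimal sketch of the standard Signal Detection Theory computation of d' used to score the networks; the function name and the correction for extreme rates are assumptions, not details from the study:

        # Python sketch (assumed notation): d' = z(hit rate) - z(false-alarm rate)
        from scipy.stats import norm

        def d_prime(hits, misses, false_alarms, correct_rejections):
            # Log-linear correction keeps z-scores finite when a rate is 0 or 1.
            hit_rate = (hits + 0.5) / (hits + misses + 1.0)
            fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
            return norm.ppf(hit_rate) - norm.ppf(fa_rate)

        # Example: a network that eats 90 of 100 food objects presented on one side
        # ("hits") and 20 of 100 non-food objects on that side ("false alarms").
        print(d_prime(hits=90, misses=10, false_alarms=20, correct_rejections=80))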

    A Simple Artificial Life Model Explains Irrational Behavior in Human Decision-Making

    Although praised for their rationality, humans often make poor decisions, even in simple situations. In the repeated binary choice experiment, an individual must choose repeatedly between the same two alternatives, a reward being assigned to one of them with fixed probability. The optimal strategy is to perseverate, always choosing the alternative with the higher expected return. Whereas many species perseverate, humans tend to match the frequencies of their choices to the frequencies of the alternatives, a sub-optimal strategy known as probability matching. Our goal was to find the primary cognitive constraints under which a set of simple evolutionary rules can lead to such contrasting behaviors. We simulated the evolution of artificial populations, wherein the fitness of each animat (artificial animal) depended on its ability to predict the next element of a sequence made up of a repeating binary string of varying size. When the string was short relative to the animats' neural capacity, they could learn it and correctly predict the next element of the sequence. When it was long, they could not learn it and turned to the next best option: to perseverate. Animats from the last generation then performed the task of predicting the next element of a non-periodic binary sequence. We found that, whereas animats with smaller neural capacity kept perseverating with the best alternative as before, animats with larger neural capacity, which had previously been able to learn the pattern of the repeating strings, adopted probability matching and were outperformed by the perseverating animats. Our results demonstrate how the ability to make predictions in an environment endowed with regular patterns may lead to probability matching under less structured conditions. They point to probability matching as a likely by-product of adaptive cognitive strategies that were crucial in human evolution but may lead to sub-optimal performance in other environments.
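
    To make the performance gap concrete, a short sketch comparing the expected accuracy of perseveration with probability matching in a repeated binary choice; the reward probability is an illustrative value, not one from the study:

        # If the better alternative is rewarded with probability p, perseveration
        # earns p per trial, while probability matching earns p^2 + (1 - p)^2.
        p = 0.7
        maximizing = p
        matching = p * p + (1 - p) * (1 - p)
        print(f"perseverate: {maximizing:.2f}, match: {matching:.2f}")
        # perseverate: 0.70, match: 0.58 -- matching is worse whenever p != 0.5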

    Computational and psychophysical approach to attentional allocation and decision making.

    The evolutionary process leaves biases in the nervous system that optimize our cognitive capacities for the environment in which we evolved. Our objective is to create Artificial Life models wherein selective attention, decision making in binary sequences, and reaction time to the abrupt appearance of a target preceded by a cue emerge as a consequence of evolution. In our experiments, selective attention biased stimulus processing so as to give priority to the most relevant stimuli when they had different relevances. Our decision-making experiments support the theory that probability matching, the strategy adopted by humans in this kind of experiment, is a consequence of a search for patterns, which results from the importance that finding regularities in the environment had during human evolution. In the study of reaction time, the behavior observed in humans could only be modeled in populations of artificial animals when there was noise and the animals had to select an appropriate action between two possible ones.

    A note on the analysis of two-stage task results: how changes in task structure affect what model-free and model-based strategies predict about the effects of reward and transition on the stay probability

    Many studies that aim to detect model-free and model-based influences on behavior employ two-stage behavioral tasks of the type pioneered by Daw and colleagues in 2011. Such studies commonly modify existing two-stage decision paradigms to better address a given hypothesis, which is an important means of scientific progress. It is, however, critical to fully appreciate the impact of any modified or novel experimental design features on the expected results. Here, we use two concrete examples to demonstrate that relatively small changes in the two-stage task design can substantially change the pattern of actions taken by model-free and model-based agents. In the first, we show that, under specific conditions, computer simulations of purely model-free agents will produce the reward-by-transition interactions typically thought to characterize model-based behavior on a two-stage task. The second example shows that model-based agents' behavior is driven by a main effect of transition type, in addition to the canonical reward-by-transition interaction, whenever the reward probabilities of the final states do not sum to one. Together, these examples emphasize the benefits of using computer simulations to determine what pattern of results to expect from both model-free and model-based agents performing a given two-stage decision task, in order to design choice paradigms and analysis strategies best suited to the question at hand.
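
    As a concrete illustration of this kind of simulation, here is a minimal sketch of a purely model-free agent on a generic two-stage task, tabulating stay probabilities by the previous trial's reward and transition; all parameter values, reward probabilities, and the TD(1)-style update are illustrative assumptions, not the paper's settings:

        import math
        import random
        from collections import defaultdict

        ALPHA, BETA = 0.5, 2.5          # learning rate, softmax inverse temperature
        COMMON = 0.7                    # probability of the common transition
        REWARD_P = {1: 0.4, 2: 0.6}     # reward probabilities at the final states

        def softmax_choice(q):
            # Two-option softmax over first-stage Q-values.
            p0 = 1.0 / (1.0 + math.exp(-BETA * (q[0] - q[1])))
            return 0 if random.random() < p0 else 1

        q = [0.0, 0.0]                          # model-free first-stage values
        counts = defaultdict(lambda: [0, 0])    # (reward, common) -> [stays, n]
        prev = None

        for _ in range(10000):
            action = softmax_choice(q)
            common = random.random() < COMMON
            state = (1 + action) if common else (2 - action)  # final state 1 or 2
            reward = 1 if random.random() < REWARD_P[state] else 0
            # TD(1)-style model-free update: the final reward credits the
            # first-stage action directly, ignoring the transition structure.
            q[action] += ALPHA * (reward - q[action])
            if prev is not None:
                key = (prev['reward'], prev['common'])
                counts[key][0] += action == prev['action']
                counts[key][1] += 1
            prev = {'action': action, 'reward': reward, 'common': common}

        for (r, c), (stays, n) in sorted(counts.items()):
            label = f"{'rewarded' if r else 'unrewarded'}, {'common' if c else 'rare'}"
            print(f"{label}: stay probability = {stays / n:.2f}")

    Under these generic settings the agent shows the canonical model-free signature (a main effect of reward, no reward-by-transition interaction); the paper's point is that other design choices can break this expectation.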

    Humans are primarily model-based and not model-free learners in the two-stage task

    Distinct model-free and model-based learning processes are thought to drive both typical and dysfunctional behaviors. Data from two-stage decision tasks have seemingly shown that human behavior is driven by both processes operating in parallel. However, in this study, we show that more detailed task instructions lead participants to make primarily model-based choices that show little, if any, model-free influence. We also demonstrate that behavior in the two-stage task may falsely appear to be driven by a combination of model-based and model-free learning if purely model-based agents form inaccurate models of the task because of misunderstandings. Furthermore, we found evidence that many participants do misunderstand the task in important ways. Overall, we argue that humans formulate a wide variety of learning models. Consequently, the simple dichotomy of model-free versus model-based learning is inadequate to explain behavior in the two-stage task and connections between reward learning, habit formation, and compulsivity.
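
    For contrast with the model-free update sketched above, a minimal sketch of what "model-based" means operationally: first-stage actions are valued prospectively through the transition model, so a reward obtained after a rare transition still favors the other first-stage action. The numbers are illustrative assumptions:

        COMMON = 0.7                 # known probability of the common transition
        v_final = {1: 0.4, 2: 0.6}   # estimated values of the two final states

        # Q_MB(a) weights final-state values by the transition probabilities.
        q_mb = {
            0: COMMON * v_final[1] + (1 - COMMON) * v_final[2],
            1: COMMON * v_final[2] + (1 - COMMON) * v_final[1],
        }
        print(q_mb)  # {0: 0.46, 1: 0.54}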

    Model-free reinforcement learning operates over information stored in working-memory to drive human choices

    Model-free learning creates stimulus-response associations, but are there limits to the types of stimuli over which it can operate? Most experiments on reward learning have used discrete sensory stimuli, but there is no algorithmic reason to restrict model-free learning to external stimuli, and theories suggest that model-free processes may operate over highly abstract concepts and goals. Our study aimed to determine whether model-free learning can operate over environmental states defined by information held in working memory. We compared data from human participants in two conditions that presented learning cues either simultaneously or as a temporal sequence that required working memory. There was a significant influence of model-free learning in the working-memory condition. Moreover, both groups showed greater model-free effects than simulated model-based agents. Thus, we show that model-free learning processes operate not just in parallel with, but also in cooperation with, canonical executive functions such as working memory to support behavior.
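
    As a sketch of the idea of a working-memory-defined state, a model-free value table can simply be keyed by the ordered tuple of cues held in memory rather than by a single external stimulus; names and values here are hypothetical, not the study's task code:

        from collections import defaultdict

        ALPHA = 0.1
        q = defaultdict(float)            # model-free values over (state, action)

        cues = ('tone', 'light')          # cues presented as a temporal sequence
        state = tuple(cues)               # the "state" is held in working memory
        action, reward = 'go', 1.0
        q[(state, action)] += ALPHA * (reward - q[(state, action)])
        print(dict(q))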

    Humans primarily use model-based inference in the two-stage task

    Distinct model-free and model-based learning processes are thought to drive both typical and dysfunctional behaviours. Data from two-stage decision tasks have seemingly shown that human behaviour is driven by both processes operating in parallel. However, in this study, we show that more detailed task instructions lead participants to make primarily model-based choices that have little, if any, simple model-free influence. We also demonstrate that behaviour in the two-stage task may falsely appear to be driven by a combination of simple model-free and model-based learning if purely model-based agents form inaccurate models of the task because of misconceptions. Furthermore, we report evidence that many participants do misconceive the task in important ways. Overall, we argue that humans formulate a wide variety of learning models. Consequently, the simple dichotomy of model-free versus model-based learning is inadequate to explain behaviour in the two-stage task and connections between reward learning, habit formation and compulsivity.

    Scheme of a typical two-stage task.

    The thicker arrow indicates the common transition and the thinner arrow indicates the rare transition.

    Difference in stay probability for model-based agents.

    Differences between the sum of the stay probabilities for model-based agents following common versus rare transitions (i.e., the sum of the dark gray bars minus the sum of the light gray bars) as a function of the sum of the reward probabilities at the final states (p + b). This specific example plot was generated assuming that the final-state reward probabilities are equal (p = b) and that the exploration-exploitation parameter in Eq 16 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0195328#pone.0195328.e021) is β = 2.5. In the differences in stay probability on the y-axis, P_rc stands for the stay probability after a common transition and a reward, P_uc for the stay probability after a common transition and no reward, P_rr for the stay probability after a rare transition and a reward, and P_ur for the stay probability after a rare transition and no reward.
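
    For reference, the quantity plotted on the y-axis and the canonical interaction contrast, in the notation defined above (per the note abstract earlier in this list, the first quantity is nonzero for model-based agents whenever p + b differs from 1):

        transition main effect = (P_rc + P_uc) - (P_rr + P_ur)
        reward-by-transition interaction = (P_rc - P_uc) - (P_rr - P_ur)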